33 research outputs found

    Intelligent Cooperative Control Architecture: A Framework for Performance Improvement Using Safe Learning

    Planning for multi-agent systems, such as task assignment for teams of fuel-limited unmanned aerial vehicles (UAVs), is challenging due to uncertainties in the assumed models and the very large size of the planning space. Researchers have developed fast cooperative planners based on simple models (e.g., linear, deterministic dynamics), yet inaccuracies in the assumed models degrade the resulting performance. Learning techniques can adapt the model and asymptotically provide better policies than cooperative planners, yet their exploratory nature often violates the safety conditions of the system, and they frequently require an impractically large number of interactions to perform well. This paper introduces the intelligent Cooperative Control Architecture (iCCA) as a framework for combining cooperative planners with reinforcement learning techniques. iCCA improves the policy of the cooperative planner while reducing the risk and sample complexity of the learner. Empirical results in gridworld and fuel-limited UAV task-assignment domains with problem sizes of up to 9 billion state-action pairs verify the advantage of iCCA over pure learning and pure planning strategies.
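
    The gist of such an architecture lends itself to a compact sketch. The following is a minimal illustration, not the authors' iCCA implementation: a Q-learning learner proposes actions, and a risk filter falls back to the cooperative planner's action whenever the proposal looks unsafe. The names (plan_action, is_safe) and the toy fuel rule are assumptions for illustration.

    import random
    from collections import defaultdict

    ACTIONS = ["up", "down", "left", "right"]

    def plan_action(state):
        # Stand-in for a fast cooperative planner built on a simple model.
        return "right"

    def is_safe(state, action, fuel):
        # Stand-in risk model: forbid moves that could strand a low-fuel UAV;
        # here "right" is assumed to head back toward base.
        return fuel > 1 or action == "right"

    q_table = defaultdict(float)            # Q(s, a) estimates
    EPSILON, ALPHA, GAMMA = 0.1, 0.5, 0.95

    def icca_action(state, fuel):
        # Learner proposes (epsilon-greedy over Q); planner is the safe fallback.
        if random.random() < EPSILON:
            proposal = random.choice(ACTIONS)
        else:
            proposal = max(ACTIONS, key=lambda a: q_table[(state, a)])
        return proposal if is_safe(state, proposal, fuel) else plan_action(state)

    def update(state, action, reward, next_state):
        # Standard one-step Q-learning update on whatever action was executed,
        # so the learner still improves even when the planner overrides it.
        best_next = max(q_table[(next_state, a)] for a in ACTIONS)
        q_table[(state, action)] += ALPHA * (
            reward + GAMMA * best_next - q_table[(state, action)])

    The key design point the sketch tries to capture is that the planner both bounds the risk of exploration and seeds a reasonable policy, while the learner's updates still use the executed action.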

    Value Iteration for Simple Stochastic Games: Stopping Criterion and Learning Algorithm

    Simple stochastic games can be solved by value iteration (VI), which yields a sequence of under-approximations of the value of the game. This sequence is guaranteed to converge to the value only in the limit. Since no stopping criterion has been known, this technique provides no guarantees on its results. We provide the first stopping criterion for VI on simple stochastic games. It is achieved by additionally computing a convergent sequence of over-approximations of the value, relying on an analysis of the game graph. Consequently, VI becomes an anytime algorithm returning the current approximation of the value together with an error bound. As another consequence, we obtain a simulation-based asynchronous VI algorithm that yields the same guarantees without necessarily exploring the whole game graph.
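
    The stopping criterion can be illustrated with a small sketch of bounded value iteration, assuming a game whose only end components are the goal and sink states (the paper's graph analysis, which makes the upper bounds converge in general games, is omitted). The game encoding below is an illustrative assumption.

    EPS = 1e-6

    # Each non-terminal state maps to (player, actions); each action is a list
    # of (probability, successor) pairs. MAX states pick the best action, MIN
    # states the worst. "goal" has value 1, "sink" has value 0.
    game = {
        "s0": ("MAX", [[(0.5, "goal"), (0.5, "s1")], [(1.0, "sink")]]),
        "s1": ("MIN", [[(1.0, "s0")], [(0.4, "goal"), (0.6, "sink")]]),
    }

    low = {s: 0.0 for s in game} | {"goal": 1.0, "sink": 0.0}
    up  = {s: 1.0 for s in game} | {"goal": 1.0, "sink": 0.0}

    def bellman(values, state):
        player, actions = game[state]
        outcomes = [sum(p * values[t] for p, t in act) for act in actions]
        return max(outcomes) if player == "MAX" else min(outcomes)

    # Iterate converging under- and over-approximations; the gap between them
    # is a sound error bound, so the loop can stop (anytime) once it is small.
    while max(up[s] - low[s] for s in game) > EPS:
        low = {s: bellman(low, s) for s in game} | {"goal": 1.0, "sink": 0.0}
        up  = {s: bellman(up, s)  for s in game} | {"goal": 1.0, "sink": 0.0}

    print({s: (low[s], up[s]) for s in game})   # both bounds reach 0.7 and 0.4

    Plain VI corresponds to iterating only low; computing up alongside it is what turns the procedure into an anytime algorithm with a certified error bound.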

    Special Agents Can Promote Cooperation in the Population

    Cooperation is ubiquitous in real life, yet every individual would like to maximize her own profits. How does cooperation arise in a group of self-interested agents without centralized control? In hostile scenarios, moreover, cooperation is unlikely to emerge at all. Is there any mechanism to promote cooperation when the population is given and the play rules are not allowed to change? In this paper, numerical experiments show that complete population interaction hinders cooperation in the finite but end-unknown Repeated Prisoner's Dilemma (RPD). We then propose a mechanism called soft control to promote cooperation. Following the basic idea of soft control, a number of special agents are introduced to intervene in the evolution of cooperation. They comply with the play rules of the original group, so they are always treated as normal agents; for our purpose, however, these special agents have their own strategies and share knowledge. We study the capability of the mechanism under different settings and find that soft control can promote cooperation and is robust to noise. Simulation results also demonstrate the applicability of the mechanism in other scenarios, and an analytical proof illustrates the effectiveness of soft control and validates the simulation results. As a form of intervention in collective behaviors, soft control suggests a possible direction for the study of reciprocal behaviors.
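
    A toy simulation conveys the flavor of the setup. The sketch below is an assumption-laden simplification rather than the paper's protocol: it pits a reciprocating strategy, one plausible choice for a special agent, against a defector in the repeated game; the shared-knowledge component of soft control is omitted.

    T, R, P, S = 5, 3, 1, 0          # standard PD payoffs: T > R > P > S
    PAYOFF = {("C", "C"): (R, R), ("C", "D"): (S, T),
              ("D", "C"): (T, S), ("D", "D"): (P, P)}

    def repeated_game(strat_a, strat_b, rounds):
        hist_a, hist_b, score_a, score_b = [], [], 0, 0
        for _ in range(rounds):
            ma, mb = strat_a(hist_b), strat_b(hist_a)  # each sees the other's past
            hist_a.append(ma); hist_b.append(mb)
            pa, pb = PAYOFF[(ma, mb)]
            score_a += pa; score_b += pb
        return score_a, score_b

    def always_defect(opp_hist):     # a hostile member of the population
        return "D"

    def tit_for_tat(opp_hist):       # one plausible special-agent strategy
        return opp_hist[-1] if opp_hist else "C"

    # Special agents cooperate with each other and punish defectors, shifting
    # which strategies earn the most once enough of them are in the population.
    print(repeated_game(tit_for_tat, tit_for_tat, 50))    # (150, 150): mutual C
    print(repeated_game(tit_for_tat, always_defect, 50))  # (49, 54): D barely gains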

    A theory to devise dependable cooperative encounters

    In this paper, we investigate how to characterize "fault tolerance" in cooperative agents. It is generally accepted that cooperating agents can achieve tasks that they could not achieve without cooperation. Nevertheless, cooperating agents can have "Achilles' heels": a cooperative encounter can fail to achieve its tasks because of the collapse of a single agent. The contribution of this paper is a study of how cooperating agents are affected by dependability issues. Specifically, our objectives are twofold: to formally define the concepts of dependability in cooperative encounters, and to analyze the computational complexity of devising dependable cooperative encounters.
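
    One way to make the notion concrete is a brute-force robustness check over a toy capability model. This is a hedged illustration of a plausible dependability definition, not the paper's formalism: an encounter is taken to be k-robust if the team can still achieve every task after any k agents collapse.

    from itertools import combinations

    agents = {"a1": {"sense"}, "a2": {"lift"}, "a3": {"sense", "lift"}}
    tasks  = [{"sense"}, {"lift"}]      # each task needs a set of capabilities

    def achievable(team):
        pooled = set().union(*(agents[a] for a in team)) if team else set()
        return all(need <= pooled for need in tasks)

    def k_robust(k):
        # Brute force over all failure patterns: exponential in general, which
        # echoes why devising dependable encounters is computationally hard.
        return all(achievable(set(agents) - set(down))
                   for down in combinations(agents, k))

    print(k_robust(1))   # True: any single agent can fail
    print(k_robust(2))   # False: losing a1 and a3 removes "sense"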

    Behavior Matching by Observation for Multi-Robot Cooperation

    Based on a formal analysis of qualitative robot behaviors, we present a criterion for classifying multi-robot tasks as either static or dynamic; it suggests that recognizing common resources is essential in dynamic cooperative tasks. In the presented framework, Cooperation by Observation, a robot finds a common resource and chooses an appropriate helpful behavior by observing other agents' behaviors. Demonstrative experiments with real mobile robots are presented: the robots have actively controlled binocular vision and are controlled by an extended behavior-based architecture. The vital part of the architecture consists of attentional buffers, which handle behavior coordination by temporarily remembering the common resource and initializing new behaviors from it.
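
    The attentional-buffer idea admits a small sketch. The following is an illustrative guess at the mechanism, not the paper's architecture: the buffer briefly remembers a common resource spotted by observing another robot, and a new helpful behavior is initialized from it. The class, field names, and decay policy are assumptions.

    import time

    class AttentionalBuffer:
        def __init__(self, ttl=5.0):
            self.ttl = ttl                  # seconds before the memory fades
            self.resource, self.stamp = None, 0.0

        def observe(self, resource):
            # Called when watching another robot reveals a shared resource.
            self.resource, self.stamp = resource, time.monotonic()

        def recall(self):
            # The memory is only valid for a short while; then attention fades
            # and the robot falls back to its default behaviors.
            if self.resource and time.monotonic() - self.stamp < self.ttl:
                return self.resource
            return None

    buf = AttentionalBuffer()
    buf.observe({"kind": "box", "pos": (1.2, 0.4)})
    target = buf.recall()
    if target:                              # initialize a helpful behavior
        print("approach and push:", target["pos"])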

    6D Localization and Kicking for Humanoid Robotic Soccer
